Multiple Reduced Phoneme Sets for Second Language Speech Recognition
Abstract
This paper describes a novel method to improve the performance of second language speech recognition when the mother tongue of the users is known. Considering that second language speech usually includes less fluent pronunciation and more frequent pronunciation mistakes, I propose using a reduced phoneme set, generated by a phonetic decision tree (PDT)-based top-down sequential splitting method, instead of the canonical phoneme set of the second language. However, because the proficiency of second language speakers varies widely, the optimal phoneme set depends on the proficiency of the speaker. In this work, I first verify the efficacy of the reduced phoneme set, and then examine the relation between the proficiency of speakers and a reduced phoneme set customized for them. On the basis of the results of this investigation, I propose a novel speech recognition method that uses multiple reduced phoneme sets to further improve recognition performance across the various proficiency levels of second language speakers. Experimental results demonstrate the validity of the proposed method.
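To make the idea of a reduced phoneme set concrete, the following is a minimal sketch of how a pronunciation lexicon could be rewritten once a set of phoneme merges is given. It is not the paper's PDT-based splitting procedure, which is not detailed in the abstract. The merge tables, the set names `coarse` and `fine`, the phoneme labels, and the toy lexicon are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical merge tables (assumptions, not taken from the paper).
# Each table maps a canonical phoneme to the reduced class that replaces it;
# phonemes that are not listed are kept unchanged.
REDUCED_SETS: Dict[str, Dict[str, str]] = {
    "coarse": {"r": "l", "th": "s", "dh": "z", "v": "b"},  # many merges
    "fine": {"r": "l"},                                    # few merges
}


def reduce_pronunciation(phonemes: List[str], merge_table: Dict[str, str]) -> List[str]:
    """Rewrite one pronunciation using the given merge table."""
    return [merge_table.get(p, p) for p in phonemes]


def reduce_lexicon(
    lexicon: Dict[str, List[str]], merge_table: Dict[str, str]
) -> Dict[str, List[str]]:
    """Rewrite every pronunciation in a word -> phoneme-list lexicon."""
    return {word: reduce_pronunciation(pron, merge_table) for word, pron in lexicon.items()}


# Toy canonical lexicon (hypothetical entries).
lexicon = {
    "right": ["r", "ay", "t"],
    "light": ["l", "ay", "t"],
    "think": ["th", "ih", "ng", "k"],
}

# One rewritten lexicon per reduced phoneme set.
lexicons = {name: reduce_lexicon(lexicon, table) for name, table in REDUCED_SETS.items()}
```

In this toy example, "right" and "light" receive the same pronunciation under either merge table, which hints at the trade-off a reduced set has to balance: fewer phoneme distinctions to model acoustically, but more words that the lexicon can no longer tell apart.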
Similar resources
Second language speech recognition using multiple-pass decoding with lexicon represented by multiple reduced phoneme sets
Considering that the pronunciation of second language speech is usually influenced by the mother tongue, we previously proposed using a reduced phoneme set for second language when the mother tongue of speakers is known. However, the proficiency of second language speakers varies widely, as does the influence of mother tongue on their pronunciation. Consequently, the optimal phoneme set is depe...
Allophone-based acoustic modeling for Persian phoneme recognition
Phoneme recognition is one of the fundamental phases of automatic speech recognition. Coarticulation, which refers to the integration of sounds, is one of the important obstacles in phoneme recognition. In other words, each phone is influenced and changed by the characteristics of its neighboring phones, and coarticulation is responsible for most of these changes. The idea of modeling the effects o...
Phoneme Set Design Considering Integrated Acoustic and Linguistic Features of Second Language Speech
Recognition of second language speech is still a challenging task even for state-of-the-art automatic speech recognition (ASR) systems. Considering that second language speech usually includes less fluent pronunciation and mispronunciation even when it is grammatically correct, we propose a novel phonetic decision tree (PDT) method considering integrated acoustic and linguistic features to deri...
Improving Phoneme Sequence Recognition using Phoneme Duration Information in DNN-HSMM
Improving phoneme recognition has attracted the attention of many researchers due to its applications in various fields of speech processing. Recent research shows that using deep neural networks (DNNs) in speech recognition systems significantly improves their performance. DNN-based phoneme recognition systems have two phases: training and testing. Mos...
Speaker Clustering for Multilingual Synthesis
Today, speech synthesizers in new languages are typically built by collecting several hours of well-recorded speech in the target language. The time and effort involved in collection and correction can be prohibitive, since a lack of resources is common for under-represented languages. An alternative method is to use acoustic data from an existing synthesizer in a different language and t...